Abstract Body
Background / Context:

Description of prior research and its intellectual context. 

Single-case or single-subject experimental designs (SSED) are used to evaluate the effect of one 
or more treatments on a single case. The basic interrupted time series design has a baseline phase 
consisting of a series of observations prior to treatment introduction, and a treatment phase 
consisting of a series of observations following treatment introduction. 

Although SSED studies are growing in popularity, their results are in principle case-specific. To
enhance generalizability, researchers can replicate the design across cases, either across studies
or within a primary study, for example through a multiple-baseline design (MBD), which
comprises interrupted time-series data from multiple participants with the timing of the
intervention staggered across the series. By synthesizing the results of SSED studies, we can
investigate the overall effect of an intervention, explore the generalizability of this effect, and
look for factors that moderate the effect.

One systematic and statistical approach for combining single-case data within and across studies 
is multilevel modeling (Nugent, 1996; Shadish & Rindskopf, 2007; Van den Noortgate & 
Onghena, 2003a, 2003b, 2008). Multilevel models were developed for analyzing clustered data, 
such as data collected from students that can be grouped or are nested in schools. In SSED data 
from multiple cases, such as in an MBD study, we also have a nested structure. First, cases are
measured repeatedly under different conditions. These measurements are grouped or ‘nested’ in 
subjects. If we have several SSED studies, subjects in turn are grouped in studies, meaning that 
three hierarchical levels can be distinguished. 

Use of the multilevel modeling framework provides an appealing option because it can be used 
to provide estimates of individual treatment effects and how these effects change over time, 
estimates of the average treatment effect and how this effect changes over time, estimates of the 
variability in treatment effects, and estimates of the effects of moderators on the treatment effect 
and on the pattern of a treatment’s effects over time. In addition, the models are flexible enough
to handle (a) the nesting of observations and of outcomes within cases and the nesting of cases
within studies, (b) a variety of forms for the growth trajectory within each phase of the design
(e.g., linear, curvilinear), (c) alternative dependent error structures for the growth trajectories
(e.g., first-order autoregressive, Toeplitz), (d) heterogeneous variances (within cases, across cases,
or across studies), (e) different types of outcomes (e.g., continuous, count), and (f)
standardized or unstandardized raw data or effect size measures.

Purpose / Objective / Research Question / Focus of Study: 

Description of the focus of the research. 

The purpose of the study is to investigate empirically the multilevel approach for 
combining SSED data. More specifically, we aim to assess the value of the approach for
numbers of observations, cases and studies that are common in SSED research, by looking at the 
bias and precision of the parameter estimates, and the validity of statistical inferences. Special 
attention is given to the problem of standardization: if data are not measured on the same scale in
each of the studies, they should be standardized before being combined. One option for making
linearly equatable scales with a meaningful zero value comparable is to divide the scores for
each case by (an estimate of) the within-phase standard deviation. The value of this approach has
yet to be investigated.
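
The standardization step can be illustrated with a minimal Python sketch. It is not the authors' SAS code. It assumes that the within-phase standard deviation is estimated as the residual standard deviation of an ordinary least squares regression, fitted per case, of the scores on time, a phase dummy, and their interaction. The function name standardize_case and the demo values are hypothetical.

```python
import numpy as np

def standardize_case(y, n_baseline):
    """Divide one case's scores by an estimate of the within-phase residual SD.

    Assumption: the SD is taken from a per-case OLS fit with separate
    level and linear trend in the baseline and treatment phases.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    occasion = np.arange(n)
    T = occasion - n_baseline                    # time, 0 at first treatment observation
    D = (occasion >= n_baseline).astype(float)   # 0 = baseline, 1 = treatment
    X = np.column_stack([np.ones(n), T, D, T * D])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sd = np.sqrt(resid @ resid / (n - X.shape[1]))  # residual SD with n - 4 df
    return y / sd, sd

# Hypothetical demo series: 5 baseline and 7 treatment observations.
rng = np.random.default_rng(0)
y_demo = 5.0 + rng.normal(size=12)
z_demo, sd_demo = standardize_case(y_demo, n_baseline=5)
```

Because the divisor is itself estimated from a short series, the standardized scores carry extra sampling error, which is the concern examined in this study.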

With the obtained study results, we want to inform applied researchers about possibilities 
and limitations of the use of the basic multilevel model for combining SSED data. At the same 
time, the results will give indications about conditions for which the model or the estimation 
procedures should be further developed (e.g., by using bootstrap procedures instead of maximum 
likelihood procedures). 

Significance / Novelty of study: 

Description of what is missing in previous work and the contribution the study makes. 

Although the multilevel modeling approach and its flexibility are appealing, there is much about 
SSED data and the functioning of multilevel modeling with this type of data and design that is 
not fully understood. For example, how well are parameters and standard errors estimated for the
typical sample sizes encountered in SSED research? What happens to the accuracy of our 
inferences when the model is not correctly specified? How should we standardize our measures 
of effect? How many cases/studies are required to make valid inferences and to have adequate 
power? 

Statistical, Measurement, or Econometric Model: 

Description of the proposed new methods or novel applications of existing methods. 

The model that will be investigated in this study is a multilevel extension of the model of Center, 
Skiba, and Casey (1985-1986). More specifically, the observed scores for case j from study k are 
regressed on a time indicator, T, that is centered around the first observation of the intervention 
phase, a dummy variable for the treatment phase, and an interaction term of these variables: 

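\[ Y_{ijk} = \beta_{0jk} + \beta_{1jk} T_{ijk} + \beta_{2jk} D_{ijk} + \beta_{3jk} T_{ijk} D_{ijk} + e_{ijk} \]

where $T_{ijk}$ is the time indicator and $D_{ijk}$ the treatment-phase dummy for observation i.
The equation shows that the expected score in the baseline phase equals
$\beta_{0jk} + \beta_{1jk} T_{ijk}$, while it is
$(\beta_{0jk} + \beta_{2jk}) + (\beta_{1jk} + \beta_{3jk}) T_{ijk}$ in the treatment phase.
$\beta_{0jk}$ indicates the expected baseline level at the start of the treatment phase (when
$T = 0$), and $\beta_{1jk}$ the linear trend in the baseline scores. The coefficient $\beta_{2jk}$
can then be interpreted as the immediate effect of the intervention on the outcome, whereas
$\beta_{3jk}$ gives an indication of the effect of the intervention on the trend.

The errors $e_{ijk}$ are assumed to be normally distributed with covariance matrix $\sigma^2_e I$.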
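
The first-level regression can be made concrete with a small Python sketch. It is an illustration, not the authors' code. It builds the four predictors (intercept, centered time, phase dummy, and their interaction) for one case and evaluates the expected score under hypothetical coefficient values. The function names and the numbers are assumptions.

```python
import numpy as np

def level1_design(n_obs, n_baseline):
    """Design matrix [1, T, D, T*D] for one case.

    T is centered on the first treatment observation (T = 0 there);
    D is 0 in the baseline phase and 1 in the treatment phase.
    """
    occasion = np.arange(n_obs)
    T = occasion - n_baseline
    D = (occasion >= n_baseline).astype(float)
    return np.column_stack([np.ones(n_obs), T, D, T * D])

def expected_scores(beta, n_obs, n_baseline):
    """Expected score per occasion for coefficients (b0, b1, b2, b3)."""
    return level1_design(n_obs, n_baseline) @ np.asarray(beta, dtype=float)

# Hypothetical case: baseline level 5, baseline trend 0.1,
# immediate effect 2, effect on the trend 0.2.
mu = expected_scores([5.0, 0.1, 2.0, 0.2], n_obs=10, n_baseline=4)
```

In this sketch the last baseline expectation is 5 - 0.1 = 4.9, the first treatment expectation is 5 + 2 = 7, and the treatment-phase slope is 0.1 + 0.2 = 0.3, which matches the interpretation of the four coefficients given above.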

At the second level of the model, the variation over cases is described using four equations: 
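
\[
\begin{aligned}
\beta_{0jk} &= \theta_{00k} + u_{0jk} \\
\beta_{1jk} &= \theta_{10k} + u_{1jk} \\
\beta_{2jk} &= \theta_{20k} + u_{2jk} \\
\beta_{3jk} &= \theta_{30k} + u_{3jk}
\end{aligned}
\qquad \text{with } \mathbf{u} \sim N(\mathbf{0}, \Omega_u)
\]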

The first equation indicates that the baseline performance for subject j from study k equals an
overall baseline performance for study k, plus a random deviation from this mean; the
subsequent equations describe the variation over subjects from the same study of the time effect
in the baseline condition, the immediate treatment effect, and the treatment effect on the linear
trend, respectively.

At the third level, the variation of the study-specific regression coefficients from the second level
equations is described:

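\[
\begin{aligned}
\theta_{00k} &= \gamma_{000} + v_{00k} \\
\theta_{10k} &= \gamma_{100} + v_{10k} \\
\theta_{20k} &= \gamma_{200} + v_{20k} \\
\theta_{30k} &= \gamma_{300} + v_{30k}
\end{aligned}
\qquad \text{with } \mathbf{v} \sim N(\mathbf{0}, \Omega_v)
\]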
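
The three-level structure can be read as a data-generating process: study-level coefficients vary around overall means, case-level coefficients vary around their study's coefficients, and observations vary around the case's regression line. The Python sketch below illustrates this under stated assumptions. It is not the authors' SAS simulation code; the function name, the rule used to stagger treatment starts, and the diagonal covariance matrices are all hypothetical choices.

```python
import numpy as np

def simulate_studies(K, J, I, gamma, omega_v, omega_u, sigma2_e, rng):
    """Simulate scores for K studies, J cases per study, I occasions per case.

    Returns y with shape (K, J, I) and the baseline lengths per case.
    """
    gamma = np.asarray(gamma, dtype=float)
    occasion = np.arange(I)
    # Hypothetical staggering rule: treatment starts spread over the middle of the series.
    starts = np.linspace(0.3 * I, 0.7 * I, J).round().astype(int)
    y = np.empty((K, J, I))
    for k in range(K):
        # Level 3: study-specific coefficients around the overall means.
        theta = gamma + rng.multivariate_normal(np.zeros(4), omega_v)
        for j in range(J):
            # Level 2: case-specific coefficients around the study coefficients.
            beta = theta + rng.multivariate_normal(np.zeros(4), omega_u)
            T = occasion - starts[j]
            D = (occasion >= starts[j]).astype(float)
            X = np.column_stack([np.ones(I), T, D, T * D])
            # Level 1: independent normal errors with variance sigma2_e.
            y[k, j] = X @ beta + rng.normal(0.0, np.sqrt(sigma2_e), I)
    return y, starts

rng = np.random.default_rng(2012)
omega = np.diag([2.0, 0.2, 2.0, 0.2])  # uncorrelated random effects (assumption)
y, starts = simulate_studies(K=10, J=4, I=20, gamma=[0.0, 0.0, 2.0, 0.2],
                             omega_v=omega, omega_u=omega, sigma2_e=1.0, rng=rng)
```

A nonzero correlation between random effects could be introduced by filling the off-diagonal elements of the covariance matrices.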



Research Design: 

Description of the research design (e.g., qualitative case study, quasi-experimental design, 
secondary analysis, analytic essay, randomized field trial). 

The value of the multilevel approach is evaluated by means of a simulation study. More
specifically, we simulated data for MBD studies with starting points of the treatment phase
staggered across the series.

Based on a study by Shadish and Sullivan (2011) of the characteristics of 809 single-case designs
used to assess intervention effects, a survey of multiple baseline studies by Ferron, Farmer, and
Owens (2010), and meta-analyses of SSED data (including, e.g., Alen, Grietens, & Van den
Noortgate, 2009; Denis, Van den Noortgate, & Maes, 2011; Kokina & Kern, 2010; Shogren,
Fagella-Luby, Bae, & Wehmeyer, 2004; Swanson & Sachse-Lee, 2000; Wang, Cui, & Parrila,
2011), we varied the following variables:

• the series length per case (I): 10, 20 and 40

• the number of cases per study (J): 3, 4 and 7

• the number of studies (K): 10 and 30

• the immediate treatment effect: 0 or 2

• the effect on the time trend: 0 or .2

• the between-case variance: diagonal elements of $\Omega_u$ equal to (2, .2, 2, .2),
(.5, .05, .5, .05) and (8, .08, 8, .08)

• the between-study variance: diagonal elements of $\Omega_v$ equal to (2, .2, 2, .2),
(.5, .05, .5, .05) and (8, .08, 8, .08)

• the within-case variance: 1 and 5

• the between-study and between-case correlation in both kinds of effects: 0 and -.3

Crossing the levels of the nine factors leads to a 3x3x2x2x2x3x3x2x2 factorial design resulting
in 2,592 simulation conditions. For each condition 2,000 data sets are simulated.
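
The size of the design can be checked by crossing the factor levels listed above. The short Python sketch below does this. The dictionary keys are hypothetical labels; the levels are copied from the list.

```python
from itertools import product

variance_levels = [(2, .2, 2, .2), (.5, .05, .5, .05), (8, .08, 8, .08)]

factors = {
    "series_length": [10, 20, 40],
    "cases_per_study": [3, 4, 7],
    "n_studies": [10, 30],
    "immediate_effect": [0, 2],
    "trend_effect": [0, 0.2],
    "between_case_variance": variance_levels,
    "between_study_variance": variance_levels,
    "within_case_variance": [1, 5],
    "correlation": [0, -0.3],
}

# One dictionary per simulation condition (full factorial crossing).
conditions = [dict(zip(factors, levels)) for levels in product(*factors.values())]
n_conditions = len(conditions)
total_datasets = n_conditions * 2000  # 2,000 replications per condition

print(n_conditions, total_datasets)
```

The crossing yields 2,592 conditions and, at 2,000 replications each, 5,184,000 simulated data sets.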
Data Collection and Analysis:

Description of the methods for collecting and analyzing data. 

Data are simulated and analyzed in SAS, using restricted maximum likelihood estimation as
implemented in PROC MIXED for multilevel (mixed) models.



Findings / Results: 

Description of the main findings with specific details. 

(May not be applicable for Methods submissions) 

The simulation study is currently being performed. Results of a pilot simulation study (with
slightly different parameter values) suggest that if data are not standardized, the estimates of the
overall immediate treatment effect and the overall effect on the linear trend are almost unbiased.
For standardized data, however, we find substantial positive bias in the estimates of the mean
effects, and this bias depends strongly on the number of observations per case, as exemplified in
Table 1.

(Please insert Table 1 here) 

As expected, the variance in estimates was higher for standardized than for unstandardized data, 
because estimating the standardizing factor (the within case residual standard deviation) results 
in additional imprecision, especially when the number of observations per case is small. As a 
result of the bias in the mean estimate, the MSE is substantially higher for standardized data if 
the number of observations per case is small. 

Standard errors were found to be relatively accurate in all conditions, but because of the bias the
coverage proportion of the confidence intervals can decline to problematic levels when using
standardized data. The decline becomes more pronounced as the number of cases and/or studies
increases, because the intervals narrow around a biased estimate, as illustrated in Table 2.
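
The performance criteria used here (bias, MSE, and confidence interval coverage) can be computed from the replications of one condition as in the Python sketch below. It is an illustration under assumptions: the function name is hypothetical, and the 90% intervals are built with the normal critical value 1.645, whereas the intervals in the study may be based on a t distribution.

```python
import numpy as np

def summarize_replications(estimates, std_errors, true_value, z_crit=1.645):
    """Bias, MSE and CI coverage of an estimator across simulated data sets.

    Assumption: intervals are estimate +/- z_crit * standard error.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - true_value
    mse = np.mean((estimates - true_value) ** 2)
    lower = estimates - z_crit * std_errors
    upper = estimates + z_crit * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return {"bias": bias, "mse": mse, "coverage": coverage}

# Toy input: three replications of an estimator of a true effect of 2.
summary = summarize_replications([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], true_value=2.0)
```

With a fixed bias, shrinking standard errors pull the interval away from the true value more often, which is why coverage can deteriorate as the number of cases or studies grows.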



(Please insert Table 2 here) 

More detailed results will be shown and discussed during the presentation. 

Conclusions: 

Description of conclusions, recommendations, and limitations based on findings. 

We conclude that the use of the multilevel model for combining single-case experimental data 
measured on the same scale yields accurate results, even with a small number of units at either 
level. Standardizing data within cases, however, introduces severe bias in the effect
estimates and therefore degrades the confidence interval coverage proportions, unless the number of
measurement occasions per case is large, say 50 observations per phase.

A limitation of the simulation study is that it only considers the basic multilevel model, that is, a
model that does not account for autocorrelation, nonlinear trends, discrete dependent variables,
and so on. These extensions will be investigated in future research.

Table 1: Bias in estimates of the mean immediate effect

Table 2: Coverage proportion of the 90% confidence intervals for the mean immediate effect